Escorted Free Energy Simulations 



Suriyanarayanan Vaikuntanathan^ and Christopher Jarzynski^'^ 
^ Chemical Physics Program, Institute for Physical Science and Technology, 
University of Maryland, College Park, MD 20742 

^Department of Chemistry and Biochemistry, 
University of Maryland, College Park, MD 20742 

We describe a strategy to improve the efhciency of free energy estimates by reducing dis- 
sipation in nonequiUbrium Monte Carlo simulations. This strategy generalizes the targeted 
free energy perturbation approach [Phys. Rev. E. 65, 046122, 2002] to nonequilibrium 
switching simulations, and involves generating artificial, "escorted" trajectories by coupling 
the evolution of the system to updates in external work parameter. Our central results are: 
(1) a generalized fluctuation theorem for the escorted trajectories, and (2) estimators for the 
free energy difference IS.F in terms of these trajectories. We illustrate the method and its 
effectiveness on model systems. 
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I. INTRODUCTION 

The computation of free energy differences is an essential component of computer studies of 
biological, chemical, and molecular processes, with applications to topics such as phase coexistence 
and phase equilibria, ligand binding events, and solvation of small molecules 1, 3]- Given the 
importance of free energy calculations in computational thermodynamics, there is a need for robust, 
efficient and accurate methods to estimate free energy differences. 

In a standard formulation of the free energy estimation problem, we consider two equilibrium 
states of a system, corresponding to the same temperature T but different values of an external 
parameter, X = A,B, and we are interested in the free energy difference between the two states, 
AF = Fb — Fa- While many widely used free energy estimation methods, such as thermodynamic 
integration and free energy perturbation rely on equilibrium sampling, there has been considerable 
interest in methods for estimating AF that make use of nonequilibrium simulations . In the 
most direct implementation of this approach, a number of independent simulations are performed in 
which the external parameter is varied at a finite rate from \ = Ato \ = with initial conditions 
sampled from the equilibrium state A. The free energy difference AF can then be estimated using 
the nonequilibrium work relation ^\^\ 

where W denotes the work performed on the system during a particular realization (i.e. simulation) 
of the process, angular brackets (...) denote an average over the realizations of the process and 
/3 = 1 /T. In principle, this approach allows one to compute AF from trajectories of arbitrarily short 
duration. However, the number of realizations required to obtain a reliable estimate of AF grows 
rapidly with the dissipation, (Wdiss) = (W) — ^F-, that accompanies fast switching simulations 0- 
Q| • The dissipation is positive as a consequence of the second law of thermodynamics, and reflects 
the lag that builds up as the system pursues ~ but is unable to keep pace with - the equilibrium 
distribution corresponding to the continuously changing parameter A 8l4lll] . This idea is illustrated 
schematically in Fig[TJ 

InRef. Q, we described a strategy to improve the efficiency of free energy estimates obtained 
with nonequilibrium molecular-dynamics simulations. This strategy involved adding non-physical 
terms to the equations of motion, to reduce the lag and therefore the dissipation. As illustrated in 
Ref. [3] using a simple model system, when these terms successfully "escorted" the system through 
a near-equilibrium sequence of states, the convergence of the free energy estimate improved dra- 
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FIG. 1. The axes schematically represent configuration space (z-spacc). The unshaded ovals denote 
the statistical state of the system, p(z,t), and the shaded ovals denote the equilibrium state, peq*'\z), 
corresponding to the value of external parameter, A(t), at various instants of time. As the work parameter 
A is switched from ^ to _B, a lag builds up as the state of the system, p(z,i), pursues the equilibrium 
distribution corresponding to the changing work parameter, peq*\z). 



matically. In the present paper we extend these results to simulations evolving according to Monte 
Carlo dynamics. We then show that the escorted trajectories satisfy a fluctuation theorem, and 
we discuss and illustrate the application of this result to the estimation of free energy differences. 

In Section [n] we introduce escorted nonequilibrium switching simulations for systems evol ving 
according to Monte Carlo dynamics. The approach we take here is motivated by previous work [12- 



151 ] and involves generating artificial, or "escorted", trajectories, Eq. [TOl by modifying the dynamics 
with terms that directly couple the evolution of the system to changes in the external parameter. 
The central result of this section is an identity for AF in terms of these escorted trajectories, Eq. 
[T9l In Section IIIII we extend this result by showing that these trajectories satisfy a fluctuation 



It 4l8t|. This in turn allows us to combine our 
19( 1 which provides an optimal, asymptotically 



relation analogous to Crooks's fluctuation relation 
approach with Bennett's acceptance ratio method 
unbiased estimator, Eq. [38l for AF In Section ITVl we show that while Eqs. [T9l and [381 are 
identities for all escorted simulations, they are particularly effective as estimators of AF when 
the modified dynamics successfully reduce the lag described above. In particular, if these terms 
eliminate the lag entirely, then Eqs. [19] and [38] provide perfect (zero variance) estimators: W = 
AF for every realization. Finally in Section [Vj we illustrate the effectiveness of our approach on 
two model systems. 
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II. ESCORTED NONEQUILIBRIUM SIMULATIONS 

Consider a system whose energy is given by a classical hamiltonian, Hx{z), where z denotes a 
microstate, that is a point in the D-dimensional configuration space of the system, [2l| and A is an 
external work parameter. At a temperature the equilibrium state of this system is described 
by the distribution 



„A 



(2) 



with the free energy Fx = — hi Zx- We wish to compute the free energy difference AF = Fb—Fa 
between two equilibrium states at the same temperature, but different values of the work 

parameter, \ = A^B. 

To estimate the value of AF, we assume we have at our disposal a discrete-time Monte Carlo 
algorithm, parametrized by the value of A and defined by the transition probability Pa(z|zo): if zq 
represents the microstate of the system at one time step, then the next microstate z is sampled 
randomly from Px(z|zo). We assume this algorithm satisfies the conditions of detailed balance, 

Pa(z|zo) _ e-^^^(^) 



Pa(zo|z) e-/^^A(zo) 



and ergodicity 



22l |. Routinely used Monte Carlo schemes such as the Metropolis algorithm 



(3) 

Q 



satisfy these conditions. Eq. [3] implies the somewhat weaker condition of balance, 

dzo Pa(z|zo) e-^-^^(^o) = e-'^^^(^) (4) 



which we will use in the analysis below. With this Monte Carlo algorithm in place, we first describe 
a standard procedure for estimating AF using nonequilibrium simulations, Eqs. [5][9] below, and 
then we introduce our modified version of this approach. 

Imagine a process in which the system is initially prepared in equilibrium, at A = j4 and 
temperature /3~^, and then the system evolves under the Monte Carlo dynamics described above, 
as the value of A is switched from A to S in steps according to some pre-determined protocol. 
This evolution generates a trajectory 7 = {zq, zi, . . . , zjv-i} that can be represented in more detail 
using the notation 

[zo,Ao] =^ [zo,Ai] [zi,Ai] =^ > [zAr_i, Aat-i] =^ [zn-i,Xn]- (5) 



Here, the symbol =^ denotes an update in the value of A, with the microstate held fixed, while 
denotes a Monte Carlo step at fixed A, e.g. the microstate zi is sampled from the distribution 
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Pai(zi|zo). Moreover, 

Xo = A , \n = B, (6) 

and the initial point zq is sampled from p^^{zq). 

Because it is specified by the sequence of microstates zo,---Z7v-i, the trajectory 7 can be 
viewed as a point in a DA^-dimensional trajectory space, with d'y = dzQ ■ ■ ■ dz^^i- For the process 
described in the previous paragraph, the probability density for generating this trajectory is 

P[l] = Ajv-i(ziV-l|z7V-2)--- A2(z2|zi)^'Ai(zi|zo)p;^5(zo) (7) 

where the factors Px.(zj|zj_i) in this equation (read from right to left) correspond to the symbols 
— )■ in Eq. [5] (read from left to right). The work 



sum of energy changes due to updates in A, 



per: ormec on the system during this process is the 
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i=N-l i=N-l 

Wh]= ^ SW,^ Yl [Hx^^d^^) - H^^iz,)] . (8) 

1=0 i=0 

Using Eqs. [3l[7]and[8l we arrive at the nonequilibrium work relation for Monte Carlo dynamics 
1 



(e-/3^) = / dTpMe-^'^W = e-^^r (9) 



Thus we can estimate AF by repeatedly performing simulations to generate trajectories of the 
sort described by Eq. [5l computing the work associated with each trajectory, Eq. [HI and finally 
constructing the exponential average, Eq. [9l As mentioned in the Introduction, however, this 
average converges poorly when the process is highly dissipative. 

To address the issue of poor convergence, let us now assume that for every integer < i < A^, we 
have a deterministic function M, : z — > z' that takes any point z in configuration space and maps 
it to a point z'. We assume that each of these functions is invertible {M~^ exists), but otherwise 
the functions are arbitrary. These Mj's then constitute a set of bijective mappings, which we use to 
modify the procedure for generating trajectories, as follows. When the value of the work parameter 
is switched from Aj to Aj+i, the configuration space coordinates are simultaneously subjected to 
the mapping Mj. Eq. [5] then becomes 

[zo, Ao] ^ [zq, Ai] [zi, Ai] [zat-i, Aat^i] [z^_i, Aat] (10) 

where 

< = Mi{z,), (11) 
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as indicated by the notation (As before, the symbol — )■ denotes a Monte Carlo move at 
fixed A.) The bijective maps effectively escort the system by directly coupling increments in A to 
changes in the microstate. This is similar to the "metric scaling" approach introduced by Miller 
and Reinhardt [3], in which each update in A is accompanied by a linear scaling of coordinates; 
however, in the present paper we do not assume the Mj's are linear in z. 

In the escorted trajectory (Eq. [T0|) . the system visits a sequence of 2N points in configuration 
space: the N "primary" microstates zq, • • • z^v-i, alternating with the N "secondary" microstates 
Zq, • • • z'^_]^. Since each z^ is uniquely determined from Zj (Eq. [TT]l . the sequence of primary 
microstates 7 = {zq, • • • ZAr_i} fully specifies the trajectory; that is, trajectory space remains DN- 
dimensional, with d^y = dzQ ■ ■ ■ dzN-i. The probability density for generating a trajectory 7 is 
given by the following modification of Eq. [71 

Pb] = PxN-ii^N-l\z'N-2) ■ ■ ■ Px2i^2\2i) Px,izi\z'o) p^gizo) (12) 

Taking a cue from Refs Q,[l3], let us now define 

7V-1 N-1 

W'h] = Y,^Wl^Yl [Hx^^A^'d - HxM) - r'inMz,)] (13) 

i=0 i=0 

where Jj(z) = \dz'/dz\ is the Jacobian associated with the map Mj : z — > z'. Averaging 
exp(— /5T4^'[7]) over the ensemble of trajectories, we have 



(e-/3W^') = yd7p[7]e-^'^'W 
= J^I dzN^i--- 1 dzoe-^S»=o''^^/PA^_^(z^_i|z'^„2)---^'A,(ziK)e-/^^^o(-o) (14) 

To evaluate this expression, we first identify all factors in the integrand that do not depend on zg 
or Zq, and we pull these outside the innermost integral, / dzQ, which gives us (for that integral): 

J dzoe-^''^op^^(zi|z[))e-'^^^o(^o) (15) 
= J dzo Jo(zo)PAi(zi|z[,)e-^^^i(^o) (16) 
= J dz'o Pai (zi \z'o) e-^^^i (^0) = e-Z^^M (-1) (17) 

We have used Eq. [13] to get to the second line, followed by a change in the variables of integration 
to get to the third line, dzQ Jo(zo) dz^, and we have invoked Eq. [Uto arrive at the final result. 
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This process can be repeated for the integrals / dzi to J dziy^2, which brings us to: 

dzN-i J7v-i(zjv-i) e-^^^Ni^N-i) (18) 



Z 



Ao 



^Ao J ^Ao 

and therefore 

(e-/3^') = e-/^^^ (19) 

Eq. [19] is an identity for AF in terms of escorted trajectories, generated as per Eq. [TOl For 
the special case in which each mapping is the identity, Mi = I, we recover the usual scheme, 
Eq. [SI and then Eq. [19] reduces to the nonequilibrium work relation, Eq. [9] Following Miller and 
Reinhardt [l^, we will find it convenient to interpret W as the work done during the switching 
process and simply denote it by W. As we will discuss in Section IIVI below, when the mappings 
{Mi} are chosen so as to reduce the dynamic lag illustrated in Fig. [T] then the efficiency of the 
estimate of AF improves, often dramatically. 



III. FLUCTUATION THEOREM 

Let us now consider not only the switching process described by Eq. llOl which we will henceforth 
designate the forward process, but also its time-reversed analogue, the reverse process. In the 
reverse process, the system is prepared in equilibrium at A = and temperature The work 
parameter is then switched to A = A in steps, following a sequence {Aq, Ai, • • • , Atv} that is the 
reversal of the protocol used during the forward process: 

Ai = XN-i (20) 

During the reverse process, changes in A are coupled to the system's evolution through the inverse 
mapping functions. Mi = M^^_-^^_-, generating a trajectory 

[z'jv„i, Xn] ^^<=^ [zN~i, Xn~i] < [zi, Ai] ^ [z'o, Ai] [zq, Aq] (21) 

where z^ = Mi(zi), and the initial state zq is sampled from p^g. The direction of the arrows indicates 
the progression of time. The probability density for obtaining a trajectory 7 = {zo,zi, . . . ,Z7v-i} 
is 

Pi'y] = Pxj,.S^n-i\z'n-2), ■ ■ ■ P~x,ii2\i'i) P~x,{iM 4(zo) (22) 
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with d'y = dzQ ■ ■ ■ d^N-i- Following Eq. [T3l the work performed during this process is 

WnYy] = [h-,^Jz'^ - H-,^{ii) - r'lnMi,)] , (23) 

where Ji(z) = \dz' /dz\ is the Jacobian for the mapping Mj. Here and below we use the subscripts 
F and R to specify the forward and reverse processes, respectively. 

We will now show that the work distributions corresponding to these two processes satisfy 



Crooks's fluctuation relation, 



16h18|] namely 

^^(^) - e/3(W^-AF) (OA) 



where 



Pf{W) = J djpFb]6{W-WFb]) (25) 

denotes the distribution of work values for the forward process, and Pr{W) is similarly defined for 
the reverse process. 

To establish this result, consider a conjugate pair of trajectories, 7 and 7*, related by time- 
reversal. Specifically, if 7 = {zq, • • • zn-i}f is a trajectory generated during the forward process, 
that visits the sequence of microstates 

Zo => Zq ^> Zi ^ Zi ^> • • • -> ZN-1 Z^_i , (26) 

then its conjugate twin, 7* = {z'^_^, • • • Zq}/j, generated during the reverse process, visits the same 
microstates, in reverse order: 

A^iV-l / / Mo / /r,-\ 

Zo ^ Zq ^ Zi <^ Z^< ^ ZAr_i <= Z^„i (27) 

that is Zj = z^_j^_j and z^ = zjv„i_j (see Eq. [2T]) . Note that the primary microstates of 7 are the 
secondary microstates of 7*, and vice- versa, and the work function is odd under time-reversal: 

WfIj] = -Wr[j*]. (28) 

We wish to evaluate the quantity 

Pf{W) e-^(^-^^) = j djpFh] e-/5(^^H-^^) 6{W - Wf^]) (29) 
with pf[7] given by Eq. [T2j To this end, we first decompose Wi?[7] as follows: 

Wf [7] = ^Ef [7] - Qf [7] -T^Sf [7] , (30) 
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where 



Ai?^[7]^i/A^.(z';v-i)-^A„(zo) 

N-1 

QFh]^^[HxM)-HxM~,)] 



i=l 
N-1 



N-1 



5^[7]^ 5]lnJA,(z,) = lnn 



i=0 



i=0 



dz' 



dz 



In 



57* 



^7 



(31a) 
(31b) 

(31c) 



Here Aii^i?[7] is the total change in the energy of the system as it evolves along the trajectory 7, 
QfM can be interpreted as the heat transferee! to the system from the reservoir [l^, and >S'i?[7] 
is an entropy-like term, which arises because the mappings Mj need not preserve volume. The 
quantities defined in Eq. [31] satisfy the properties 



PXm-1 i^N-1 Wn-2) ■■■P\i (Zl |Zo) = Aat^i (Z^„2 |Z7V-1) ' ' ' (ZqIzi) 6 



-/3<3f[7] 



where we have used Eqs. [2]and[3l These properties then give us 



/3(A£;p[7]-AF) 



(32a) 
(32b) 



PFh] = PA^^_l(zAf-l|z^_2)•••^Al(zl|Zo)Pe•'(zO 



PXn-i (z'Ar-2|z7V-l) ■ ■ ■ Px, (Zo|zi) 6 

xp^,-(zW_i)e/^(^^H7l-AF) 
-PRb*] e' 



(33) 



,*1 p/3(W^f[7]-AF) ^Sph] 



hence 



PFb] e 



-ISilVphj-AF) 



■PRb* 



(34) 



Substituting this result into the integrand on the right side of Eq. [29l then changing the variables 
of integration from to dj*, and invoking Eq. [28| we finally arrive at the result we set out to 
establish: 



(35) 



Eq. [35] in turn implies that the average of any function f{W) over work values generated in the 
forward process, can be related to an average over work values obtained in the reverse process: [l^ 

{fiW))F 



-/3AF 



(36) 



{f{-W)e-^W)R 

In principle, this result can be used with any f{W) to estimate AF. The problem of determining 
the optimal choice of f{W) was solved by Bennett in the context of equilibrium sampling, 19|] and 
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this solution can be applied directly to the nonequilibrium setting. Specifically, if we have 

np work values from the forward simulation, and nji work values from the reverse simulation, then 
the optimal choice is 

^<"'> = l + eMPW + m ^'"^ 

where K = —AF + /3~^ ln(np/nfi). The value of AF is then estimated by recursively solving the 
equation, 

(l/(l + e/^(^-^)))^" 

as described in detail in Ref. [3]. This procedure for estimating AF is known as Bennett's 
Acceptance Ratio method (BAR). 

IV. COMPUTATIONAL EFFICIENCY AND FIGURES OF MERIT 

While Eqs. [19] and [38] are valid for any set of invertible mapping functions, {Mi}, the efficiency of 
using escorted simulations to estimate AF depends strongly on the choice of these functions. Since 



the convergence of exponential averages such as Eq. [19] deteriorates rapidly with dissipation 
which in turn correlates with the lag illustrated in Fig. ([T]), it is reasonable to speculate that a 
choice of mappings that decreases the lag will improve the convergence of estimator (Eq. [T9|) . 

To pursue this idea, let us first consider the extreme case of a set of mapping functions {M*} that 
entirely eliminates the lag. By this we mean the following: for an ensemble of trajectories generated 
using Eq. [ini with zq sampled fromp^(zo), the subsequent microstates Zj are distributed according 
to Pei(zj), for all 1 < i < A^. That is, the shaded and unshaded ovals coincide in Fig. ([T]). This 
occurs if under the bijective mapping M* : z ^ z', the equilibrium distribution p^^{z) transforms 
to the distribution peq'^^{z') [3], in other words 

Per(-') = ^ (39) 

[Under a bijective map M : x — > y, a distribution /(x) is transformed to the distribution r]{y) = 
/(x)/J(x), where J(x) = |5y/5x|.] When all the Mj^ 's satisfy this condition, we will say that the 
set of mappings is perfect. Using p^^ = e^^^^~^^\ and taking the logarithm of both sides of Eq. 
we obtain (for a perfect set of mappings) 



5Wi ^ (z) - H^, (z) - In Jl (z) = F,^^, - F,, , 



(40) 
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hence W[y] = AF for every trajectory 7 (Eq. I13p . Thus for a perfect set of mappings we have 
Pf{W) = 5{W — AF), and Eq. [19] provides a zero- variance estimate of the free energy difference. It 
is straightforward to show that if the M*'s form a set of perfect mappings for the forward process, 
then the M*'s form a set of perfect mappings for the reverse process, and Pr{W) = 5{W + AF). 

The considerations of the previous paragraph support the idea that reducing lag improves 
convergence. While we generally cannot expect to be able to construct a perfect set of mapping 
functions (this is likely to be far more difficult than the original problem of estimating AFl [12]), 
in many cases it might be possible to use either intuition or prior information about a system to 
construct a set of Mj's that reduce the lag substantially. In such cases the dissipation accompanying 
the escorted simulations is less than that for the unescorted simulations, leading to improved 
convergence of the free energy estimate. 

As an example of a strategy that can be used to construct good mappings, consider a system 
of identical, mutually interacting particles, in an external potential Ux{r): 

Hx{z) = Uxirk) + y(^k,ri) (41) 

k k<l 

The probability distribution of a single, tagged particle is then given by the single-particle density 



P?ir) 



^ydz5[rfe(z)-r]e-'^^^(^) (42) 



where rfc(z) specifies the coordinates of the tagged particle as a function of the microstate z. Now 
consider a reference system of non-interacting particles, described by a Hamiltonian 

Hx{z) = J2Ux{rk) (43) 

k 

with a similarly defined single-particle density p^^\r); and imagine that Ux is chosen so that 
these single-particle densities are identical or nearly identical: p^^''(r) ~ p^^\r). In this case a set 
of mappings {Mj} that are perfect or near-perfect for the reference system (Hx), might be quite 
effective in reducing lag in the original system (Hx)- We will illustrate this mean- field-like approach 
in Section IV Bl and we note that a similar strategy was explored by Hahn and Then in the context 
of targeted free energy perturbation [l^ . 

It will be useful to develop a figure of merit, allowing us to compare the efficiency of our 
method for different sets of mappings. One approach would be simply to compare the error bars 
associated with the statistical fluctuations in the respective free energy estimates. Unfortunately, 
estimates of AF obtained from convex nonlinear averages such as the one obtained from Eq. \T9\ 
are systematically biased for any finite number of realizations QjIsS]. This bias can be large, and 
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as a result the statistical error bars by themselves might not be sufficiently reliable to quantify the 
efficiency of the mapping. In the following paragraphs we discuss alternative figures of merit. 

We begin by noting that when the unidirectional estimator, Eq. \T9\ is used in conjunction 
with simulations of the forward process, then the number of realizations (Ng) required to obtain a 
reliable estimate of AF is roughly given by 0, ^ 

where {W)r + AF is the dissipation accompanying the reverse process. While this provides some 
intuition for the convergence of Eq. [191 its usefulness as a figure of merit is somewhat limited as it 
requires simulations of both the forward and the reverse processes, and in that case we are better 
off using a bidirectional estimator such as Eq. [381 

When we do have simulations of both processes, then an easily computed figure of merit is the 
hysteresis, (Wdiss) F + {Wdiss)R = (W) p + (W) r. The value of this quantity is zero if the mappings 
are perfect, otherwise it is positive. It is interesting to note that the hysteresis can be related to 
an information-theoretic measure of overlap between the forward and reverse work distributions 



Pf{W) and Pr{-W): |26| 



D[Pf\\Pr] + D[Pr\\Pf] = P{{W)f + {W)r). (45) 

Here = J p\n{p/q) > denotes the relative entropy between the distributions p and q, and 

the symmetrized quantity -D[p||g] + ^[(zHp] (also known as the Jeffreys divergence 27(]) provides 
a measure of the difference, or more precisely the lack of overlap, between the distributions. The 
right side of Eq. [35] can be estimated from a modest sample of forward and reverse simulations. 
If a set of mappings reduces the hysteresis, (W) f + (W^)_R, then this indicates increased overlap 
between the work distributions, and therefore improved convergence al. 

nn 

When riF = fiR = Ng ^ 1, the mean square error of the Bennett estimator is |14l. I19I. 
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280 



((^51«-AF)=) = ^(^-1J. (46) 

Here F^^^j^ denotes the estimate of AF obtained from Eq. [38l and 

C - f dW ''^^^^''^^-^^ I ' \ I ^ \ U7) 

~J Pf{W) + Pr{-W) \l + exp[/3(l^-AF)]/^ \ 1 + exp[/3(I^ + AF)] ^ ^ 

(This result can be generalized to the case uf ^ ur [l^.) As discussed by Bennett [3] and 
Hahn and Then Q, 13], the value of C measures the overlap between Pf{W) and Pr{—W)^ and 
provides a rough figure of merit for the Bennett estimator. When lag is eliminated and the two 
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distributions coincide, then C attains its maximum value, C = 1/2, whereas when there is poor 
overlap, C ~ 0. Thus we expect that the higher the value of the overlap function C, the smaller 
the number of realizations Ng required to estimate AF from Eq. [38] with a prescribed accuracy. 
Indeed, Eq. |36] suggests a lower bound on the number of realizations needed to achieve a mean 
square error less than /3~^: Ns > 1/C. Note that since C is an ensemble average (Eq. [T7|l . it can 
readily be estimated from available simulation data. 

In the Appendix, we derive an upper bound on the number of realizations needed to obtain a 
reliable estimate of AF using Bennett's method, Ng (Eq. IA4p . Combining these bounds gives us 

^<Ns<-^ (48) 



29[, it does allow us to argue 



While Eq. HH] cannot be used to obtain a good estimate for A''^ 
heuristically that whenever a set of mappings succeeds in increasing the value of C, the convergence 
of the Bennett estimator is improved. We will illustrate this point in the following section. 



V. EXAMPLES 
A. Cavity Expansion 

As a first example, we estimate the free energy cost associated with growing a hard-sphere solute 
in a fluid. Consider a system composed of iip point particles inside a cubic container of volume 
L^, centered at the origin with periodic boundaries. The particles are excluded from a spherical 
region of radius R, also centered at the origin. The particles interact with one another via the 
WCA pairwise interaction potential [l| which is denoted by V{rk,ri). The energy of the system at 
a microstate z = (ri , r2 , . . . , r„p ) is given by 

k=l l>k 

where Q{z, R) = whenever |rfc| > R for all /c = 1, • • • Up, that is when there are no particles inside 
the spherical cavity; and @{z,R) = oo otherwise. The function 0(z,i?) ensures that particles are 
excluded from the spherical region around the origin. We wish to compute the free energy cost, 
AF, associated with increasing the radius of the cavity from Ra to Rb (See Fig. [2]). 

A hypothetical estimate of AF using unescorted nonequilibrium simulations (Eq. [S]) involves 
"growing out" the spherical cavity in discrete increments, as follows. Starting with a microstate 
zq sampled from equilibrium at R = Ra, the radius of the sphere is increased by an amount SRq. 
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FIG. 2. A schematic of the cavity expansion problem 



If all Up fluid particles remain outside the enlarged sphere, then 5Wq = 0; but if one or more 
particles now finds itself inside the sphere (r^ < Ra + SRq) then 6Wq = oo. One or more Monte 
Carlo steps are then taken, after which the radius is again increased by some amount, (5i?i, and 
6W\ is determined in the same fashion as 5Wq. In principle this continues until the radius of the 
sphere is Rb, and then the work is tallied for the entire trajectory: W = SWi. In practice the 
trajectory can be terminated as soon as 5Wi = cx) at some step i, since this implies W = oo. For 
this procedure, Eq. [T] can be rewritten as 



where P is the probability of generating a trajectory for which W = 0; that is, a trajectory in 
which the sphere is successfully grown out to radius Rb, without overtaking any fluid particles 
along the way. The quantity P is estimated directly, by generating a number of trajectories and 
counting the "successes" {W = 0). For a sufficiently dense fluid, however, a successful trajectory 
is a rare event {P <C 1), and this approach converges poorly. Note also that this approach does 
not give the correct free energy difference in the reverse case of a shrinking sphere (from R = Rb 
to R = Ra), since = for every trajectory in that situation. 

For the hypothetical procedure just described, Eq. [50] implies that the probability to generate 
a successful trajectory does not depend on the number of increments used to grow the cavity from 
Ra to Rb- Therefore the most computationally efficient implementation is to grow the sphere out 



P 




(50) 
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FIG. 3. Running estimate of P = cxp(— /3AF) from escorted free energy simulations, plotted as a function 
of the number of trajectories used to obtain the estimate. 

in a single step, which corresponds to the free energy perturbation method (FEP) [l|, Q]. In this 
case P is just the probability to observe no particles in the region Ra < r < Rb, for an equilibrium 
simulation at cavity radius Ra. 

To improve convergence by means of escorted simulations (Eq. fTO]) . we constructed mapping 
functions Mi that move the fluid particles out of the way of the growing sphere, to prevent infinite 
values of 5Wi. Specifically, as the cavity radius R is increased from Ri to -Rj+i, the location of the 
n*'* particle, r„, is mapped to r'„ = mj(r„), where [3| 



1 + 



R^L' 



8r' 



(L3 - 8Rf)rl 



1/3 



if rn < L/2 



(51) 



and mj(r„) = r„ if r„ > L/2. The notation rrii : Vn ^ denotes a single-particle mapping; the 
full mapping Mj : z ^> z' is obtained by applying rrii to all Up fluid particles. To picture the effect 
of this mapping, let Si denote the region of space defined by the conditions Ri < r < L/2, that 
is a spherical shell of inner radius Ri and outer radius L/2 (just touching the sides of the cubic 
container). Under the mapping mj : r — )> r', the shell Si is compressed uniformly onto the shell 



leaving the eight corners of the box r > L/2 untouched. 



301 1 In this manner, the particles 



that would otherwise have found themselves inside the enlarged sphere are pushed outside of it, 
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{W)f 


22.288± 0.012 


{W)r 


-14.458± 0.013 


{W)f + {W)r 


7.830± 0.018 


^pest 


18.487 ±0.085 


^pest 


-18.334 ±0.078 




18.456 ±0.011 


c 


0.120 ±0.001 



TABLE I. Estimates and figures of merit. Here AFp''^ denotes the estimate of AF = Fb — Fa from the 
forward process {Ra Rb) and AF^* denotes the estimate of ~AF from the reverse process {Ra ^ Rb)- 
AFg^jf denotes the estimate of AF obtained from Bennett's Acceptance Ratio method. 

resulting in a finite contribution to the work (Eq. [T3|) , 

np-l rip 
fc=l l>k 

where uq = nQ{z) is the number of particles found within the shell Ri < r < L/2 (before the 
mapping is applied), and 7 = {L"^ — 8Rf_^_l) / {L^ — 8Rf) < 1 is the ratio of shell volumes, 
The first term on the right side of Eq. 1521 gives the net change in the energy of the system associated 
with the escorted switch [zj, R^] =^ [z'-, while the second is the Jacobian term — /3~^ In Jj(zj). 

Unlike the unescorted approach or free energy perturbation, the escorted approach with the 
mapping given by Eq. [5T]is applicable in both the forward (growing spherical cavity) and reverse 
(shrinking cavity) directions. In the reverse direction, as the solute radius is decreased from Ri+i 
to Ri, the shell is uniformly expanded onto the shell Si. The corresponding increment in work 
is given by a formula similar to Eq. [52l As a result, one can combine work values from forward 
and reverse escorted simulations using Bennett's Acceptance Ratio (BAR), Eq. [351 

We have performed both forward and reverse simulations of this system using Up = 1000 WCA 
particles, with L = 10.42(7, Ra = 2.0a, Rb = 2.05a, and T* = ksT/e = 1, where the WCA 
parameters a and e set the units of length and energy, respectively. Minimum image convention 
and periodic boundary conditions were used 

Fig [3] shows a running estimate of P = exp(— /3AF) obtained from escorted simulations in 
which the solute radius was switched from Ra io Rb va. N = 10 steps, with each increment 
in R alternating with one Monte Carlo sweep. Using a total of Ng = 50000 independent escorted 
trajectories, estimates of AF and the figures of merit were obtained, and are summarized in TableU 
(The value of C and AF^'^^ were estimated using np = ur = Ng = 50000 trajectories). Statistical 
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3l|. While an analytical expression for AF 



error bars were computed using the bootstrap method 
is not available for this example, the agreement between the estimates obtained by growing the 
solute (F), shrinking it (R), and applying BAR gives us confidence in the result, AF w 18.4 ksT. 

As an additional consistency check, in Fig. |4]we verify that the escorted simulations satisfy the 
fluctuation theorem Eq. [35j We do this by following steps analogous to those in Section III of 
Ref. Q to obtain a restatement of Eq. 



L2iW)-Li{W) 



w 

lnPjii-W) + f3— 



w 

lnPFiW)-(3— 



/3AF 



(53) 



In FigHlwe plot Li, L2, and L2 — Li as functions of W. The flatness of the difference L2 — Li over 
the region for which we have good statistics is in agreement with Eq. [53l and provides a useful and 
stringent consistency check [1 



], which gives us further confidence in our estimates. 

While the highly accurate estimates listed in Table |T] were generated using Ng = 50000 escorted 
trajectories, we found that we were able to obtain estimates of AF with error bars around 1 ksT 
using only Ng = 100 realizations for the unidirectional estimators, and Ng = 10 realizations for the 
bidirectional estimator (data not shown). 

To compare the escorted method with unescorted free energy perturbation (FEP), we first 
sampled Ng = 100000 independent configurations from the canonical ensemble with cavity radius 
R = Ra, by generating a single, long equilibrium Monte Carlo trajectory and sampling one con- 
figuration per 10 Monte Carlo sweeps. This involved a total computational time approximately 
equal to that of generating 50000 escorted trajectories. Among these 10^ configurations we did 
not observe a single one in which the region Ra < r < Rb was spontaneously devoid of particles 
{W = 0), in other words we were unable to obtain an estimate of AF using free energy perturba- 



tion. This is consistent with the result P 



-18.4 



10"^ (Fig. El Tabled]), which suggests that 



roughly 10^ independent configurations are needed to observe one for which W = 0. 

For a more efficient implementation of FEP, we divided the interval [Ra_,Rb] into ten stages 
(sub-intervals), and then used FEP to estimate the free energy change for each stage, keeping the 
total computational time fixed. This provided a final estimate of AF with error bars comparable 
to those of the unidirectional escorted estimators in Table [U but still considerably larger than those 



of the bidirectional estimates (data not shown). 
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FIG. 4. Graphical verification of tlic fluctuation theorem and estimation of AF. The horizontal line indicates 
the estimate of AF obtained from the acceptance ratio method (Table |T]) . 

B. Dipole Fluid 

As our second example, we consider Up point Lennard-Jones dipoles in a cubic container of 
size L with periodic boundaries, and we compute the free energy cost associated with introducing 
a uniform electric field in the container. The energy of the system in an external electric field 
E = Ebz, where denotes a unit vector along the z-axis, is given by 



7-; 



Pfc • Pi 



(54) 



k=l k=l l>k l'^'^ ~ -"^'l 

where z = {ri, pi, . . . r„p, p^^}, p^ denotes the dipole moment vector of the k^^ particle, and 
V^j(r^., r;) denotes the Lennard-Jones pairwise interaction potential. The parameter 7 controls the 
strength of the dipole-dipole interaction. We set |pfc| = 1 for all k. In spherical polar coordinates, 
Pfc = {l,9k,(pk), and the measure on z space is hence dz = Il^^-^dri.dcos{9k)d(j)i.. 

Taking the electric field to be the external parameter, we wish to compute the free energy 
difference between the ensembles corresponding to E = and E = Ej at some temperature /3~^ 
by performing nonequilibrium switching simulations. Our first task is to construct a mapping 
function that escorts the system along a near equilibrium path as E is switched. Following Eq. H3l 
we consider the energy function He{z) = He,o{z) (i.e. 7 = in Eq. which describes a system 
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of non-interacting Lennard-Jones dipoles in a field of strength E. The change in free energy as the 
field is switched from Ei to -Ej+i can be solved analytically and is given by 



-n,iln 



sinh(/3E,+i) Ei 



(55) 



sinh(/3^i) Ei+i_ 

We now use this result to solve for a perfect set of mappings for this system of non-interacting 
dipoles. 

Let rrii : C, = cos{0) — )■ denote a mapping that acts on the C = cos{9) degree of freedom of a 
dipole when the external field is switched from Ei to -Bi+i- The full mapping Mi is obtained by 
applying the mapping rrii to all Up particles. We look for the perfect mapping Mi that transforms 
the canonical distribution corresponding to (z) to the canonical distribution corresponding to 
HEi^ii'2'')- The following equation for the perfect single particle mapping rrii can be obtained from 
Eq. Ho] by using Eqs. [Ml and [55] and by noting that • E = E^f^: 

Ei 



1 ^^ smh{pEi+i) . 
'/3 sinhi/S Ei) E,+i 



(56) 



This differential equation has the solution 

1 



rriiiC) 



■In 



sinh(/?g,H-i ) .gi;,:C ^gg. 
sinh(/3^i) ^ 



) + e 



(57) 



While Eq. [57] is a perfect mapping only when there are no dipole-dipole interactions (7 = 0) we 
expect this mapping to work reasonably well for small values of 7. We will use the term simple 
mapping in reference to Eq. [57] 

We also constructed a set of mapping functions using mean field [3j| arguments as follows. In 



the absence of long range order, mean field theory suggests that the interacting dipole-fiuid system 
(7 7^ 0) in an electric field of strength E can be approximated by a system of non-interacting dipoles 
(7 = 0) in an effective field of strength E' . We obtained approximate values for this effective electric 
field by first numerically evaluating the single-dipole distribution -P(C) j C = cos(0), aX E = Ef. The 
thermal distribution of Q for a non-interacting dipole in a field of strength E'p is -Po(C) oc exp(/3£"^(^). 
Hence E'^ can be estimated by fitting Pq to the numerically obtained distribution P{C)- For all 
other values of E, we calculate the effective fields by linear scaling, E' = EE'^/Ef. Again, using 
Eq. [30] with He{'2') = HE'fliz) we obtain a new set of mapping functions. In particular, when 
the E field is switched from Ei to the Cfc = cos(^fc) degree of freedom of the A;*^ dipole is 

transformed according to Eq. [58l 



miCk) 



1 



-In 



sm\i{l3E[) ^ 



(58) 
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FIG. 5. Work histograms obtained from forward and reverse simulations performed at 7 = 0.1. The degree 
of overlap between Pf{W) and Pfj{—W) provides an indication of the efficiency of the free energy estimate. 
For unescorted simulations (no mapping) we see no overlap, reflecting considerable dissipation and poor 
efficiency (Table . With the mapping given by Eq. [57] the overlap is much improved, and with the mean 
field mapping, Eq. [58] the forward and reverse distributions are nearly identical. 

We will refer to Eq. [58] as a mean field mapping. Since the single-dipole distributions for the 
interacting system at field strength E are (by construction) closely approximated by the single- 
particle distributions for the non-interacting system at E' , we expect the mean field mappings to 
perform better than the simple mappings of Eq. [571 

We performed numerical simulations with Up = 800 particles. The parameters a, e of the 
Lennard- Jones potential set the length and the energy scale of the system, and we took L = 10a and 
T* = ksT/e = 1. Minimum image convention and periodic boundary conditions [l| were used. We 
performed = 10^ forward and reverse simulations to estimate the free energy difference between 
the ensembles corresponding to -E = and E = 1, switching the field strength in = 10 equal 
increments. Ten Monte Carlo sweeps were performed between these updates in E. We obtained 
estimates of AF using: (1) unescorted switching simulations (Eq. [T|), (2) escorted simulations 
with the simple mappings (Eq. I57p . and (3) escorted simulations with the mean field mappings 
(Eq. I58|) . For the latter, the effective fields were obtained as described in the previous paragraph. 
In particular, we found E'j ~ and therefore we took E'- = 1.5Ei in Eq. [58] 

Fig. [5] shows the work distributions Pf{W) and Pji{—W) for these sets of simulations, and 



I No mapping 

I Simple mapping 
I Mean field mapping 

1 

I 
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No mapping 


Mapping 


Mean field mapping 


{W)f 


-60.409 ±0.126 


-177.074 ±0.039 


-189.079 ±0.010 


{W)r 


302.958 ±0.132 


200.607 ±0.045 


189.971 ±0.010 


{W)f + {W)r 


242.549 ±0.182 


23.533 ±0.060 


0.892 ±0.014 


^pest 


-114.189 ±3.913 


-187.612 ±0.405 


-189.552 ± 0.011 


^pest 


262.232 ±0.711 


191.877 ±0.310 


189.502 ±0.0140 


^Fbar 


-128.215 ±3.324 


-189.599 ±0.110 


-189.530 ±0.008 


c 


- 


0.011 ±0.001 


0.407 ±0.001 



TABLE II. Estimates and Figures of Merit for 7 = 0.1. Note that the simulations with the mapping are much 
more efficient than those without. The forward and reverse work histograms obtained from the simulations 
without any mappings were so far apart that a reliable estimate of C could not be obtained. 



reveals a progression from virtually no overlap for the unescorted simulations, to some overlap for 
the simulations with the simple mappings, to nearly perfect overlap when using the mean field 
mappings. This trend is in agreement with the expectations mentioned above, and provides direct 
evidence that the mappings we have constructed substantially reduce the lag and dissipation. The 
first three rows of Table |TI] quantify these observations. In particular, row 3 gives the distance 
between the means of PpiW) and Pr(— VF), and shows that this hysteresis proceeds from nearly 
250^5 r to about lAksT to less than IksT in the three cases. Rows 4 to 6 illustrate the effect of 
this trend on the efficiency and accuracy of the free energy estimates. The estimates of AF (that 
is, AFp?^, —AFf^^, and AF^j^^^ obtained from the unescorted simulations differ substantially 
from one another, indicating a high degree of bias. The estimates corresponding to the simple 
mappings are markedly better, though they still suggest a degree of bias on the order of IksT. 
Finally, the simulations with the mean field mappings are in agreement to within about COS/c^T, 
indicating excellent accuracy and efficiency. These findings are also in agreement with the values 
of the overlap integral C, shown in row 7. This was too low to be estimated using the unescorted 
simulations, and approaches its maximal value of 1/2 when using the mean field mappings. Using 
escorted simulations with the mean field mappings, with the acceptance ratio method (BAR), we 
found that we were able to generate estimates of AF with error bars on the order of 0.2kBT, with 
about Ns '-^ 1 / '--^ 10 (data not shown). 
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VI. SUMMARY 

Nonequilibrium fast switching estimates of free energy differences often perform poorly due to 
dissipation (see Fig[T]). The strategy developed here seeks to address this issue. By modifying the 
dynamics with additional terms that serve to escort the system along a near equilibrium trajectory 
and consequently reduce dissipation, we obtain efficient fast switching estimators (Eq. [T9]l for the 
free energy difference. The success of the strategy depends crucially on the choice of the mapping 
functions Adf. the more effectively these reduce the dissipation, the more efficient the resulting 
estimator of AF. 

The examples presented in Section |V] illustrate this point. For the hard sphere solute, we 
used a simple mapping function that uniformly compresses the solvent, vacating the region into 
which the hard sphere expands (Eq. IST]) . With this escorting function we were able to estimate 
AF directly from single-stage switching simulations, which would not have been feasible without 
escorting. In the example of the Lennard-Jones dipole fluid, we used a reference system of non- 
interacting dipoles to construct a reasonable set of mapping functions (Eq J57p . and then we further 
refined these mappings using mean field arguments (Eq. [58]) . Figure [5] and Table |TI] illustrate the 
correlation between reduced dissipation and increased computational efficiency. Because mean field 
theory often provides an good description of many-body systems, we speculate that this approach 
will prove effective for more complex problems of physical interest. 

We have also discussed figures of merit, specifically the dissipation in the forward and reverse 
processes, and the overlap integral C (Eqs. El IIU 147]) . For the two examples in Section |Vl we 
found that these quantities indeed track the effectiveness of the mapping functions. This suggests 
that these figures of merit might be useful to iteratively improve the performance of the mapping 
functions. 

Finally, the efficiency of our method might further be improved by applying it in combination 



with other methods, such as biased or umbrella sampling algorithms (see e.g. Refs j35l438l|) 
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Appendix A: Appendix 

Here we derive a relation between Ns and C for the bidirectional estimator, Eq. [38j The Bennett 
estimator, Eq. [38] can be rewritten as a ratio of two free energy perturbation identities [s^ 

{Ph{W)/Pf{W))p^^w) ^ ^ 
{Ph{W)/Pr{-W))p^^_w) ^ ' 

where (. . . )p^(h/) denotes an average over W values sampled from Pf{W), (. . . )p^(_vv) denotes 
an average over W values sampled from Pr{—W), PniW) = pp!(|y)^^^(l^) with C = 
J dW p^^\l^p^f^^^\^ is the normalized harmonic mean distribution. As the averages in the nu- 
merator and the denominator are over different ensembles, let us separately consider the number 
of the realizations required for each to converge. 

The dominant contributions to the average in the numerator come from work values that 
are typically sampled from the harmonic mean distribution Ph [6|]. The probability that these 
dominant values are observed in the forward process can be given by P = ^Typicai'^^ ^^^^^^ ~ 
Ixypicai PhPf(W) / Ph , where J^ypicai denotes that the integration is performed over the range 
of W values that are typically sampled from the harmonic mean distribution {PniW)). 



Following Ref 



we now write 



P~ / dP^P^^e"^ ~e<'"^^^ / dWPn^e'"'''^^" (A2) 



' Typical J Typical 

The number of realizations Ng required for adequate sampling can be roughly given by Ng ~ ~ 
exp Z)[Ph'||Pp], where we have used — (In-^)/^ = D[Ph\\Pf]- The relative entropy L'[Pf/||PF] 
satisfies the following inequality 

1 PrPf , Pr 



D[Ph\\Pf] = [ - ^^^^ In 
L ^11 J CPr + Pf 



< In 



CPr + Pf C{Pf + Pr) 
P 



1 4PIPf 



4C^P^ + Ppf (A3) 



= -21n2C 



where we have used the Jensen's inequality [27l | for concave functions together with the identity 
APfPr < {Pf + Pfi)^ ■ Finally, using Eq. IA31 the number of realizations required to obtain a 
reliable estimate of AF using Bennett's method is bounded by 

Ns<-^ (A4) 
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We have not included the numerical factors in the above relation as it is already an approximate 
equation. 
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